@ColinLeeo ColinLeeo commented Jan 14, 2026

def dataframe_to_tsfile(dataframe: pd.DataFrame,
                        file_path: str,
                        table_name: Optional[str] = None,
                        time_column: Optional[str] = None,
                        tag_column: Optional[list[str]] = None,
                        ):
    """
    Write a pandas DataFrame to a TsFile by inferring the table schema from the DataFrame.

    This function automatically infers the table schema based on the DataFrame's column
    names and data types, then writes the data to a TsFile.

    Parameters
    ----------
    dataframe : pd.DataFrame
        The pandas DataFrame to write to TsFile.
        - If a 'time' column (case-insensitive) exists, it will be used as the time column.
        - Otherwise, the DataFrame index will be used as timestamps.
        - All other columns will be treated as data columns.
    file_path : str
        Path to the TsFile to write. Will be created if it doesn't exist.
    table_name : Optional[str], default None
        Name of the table. If None, defaults to "table".
    time_column : Optional[str], default None
        Name of the time column. If None, will look for a column named 'time' (case-insensitive),
        or use the DataFrame index if no 'time' column is found.
    tag_column : Optional[list[str]], default None
        List of column names to be treated as TAG columns. All other columns will be FIELD columns.
        If None, all columns are treated as FIELD columns.

    Returns
    -------
    None

    Raises
    ------
    ValueError
        If the DataFrame is empty or has no data columns.
    """

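The column rules in the docstring above can be sketched in plain pandas. This is only an illustration of the described behavior, not the code in this PR: `resolve_time_column` and `split_columns` are hypothetical helper names, and raising `ValueError` for a named column that is not in the DataFrame is an assumption the docstring does not state.

```python
from typing import Optional

import pandas as pd


def resolve_time_column(df: pd.DataFrame,
                        time_column: Optional[str] = None) -> Optional[str]:
    """Return the column to use as time, or None to fall back to the index."""
    if time_column is not None:
        # Assumption: an explicitly named column that is missing is an error.
        if time_column not in df.columns:
            raise ValueError(f"time column {time_column!r} not found")
        return time_column
    # Otherwise look for a column called 'time', ignoring case.
    for name in df.columns:
        if isinstance(name, str) and name.lower() == "time":
            return name
    # No match: the caller should use the DataFrame index as timestamps.
    return None


def split_columns(df: pd.DataFrame,
                  time_column: Optional[str] = None,
                  tag_column: Optional[list] = None):
    """Split the columns into (time name, TAG columns, FIELD columns)."""
    time_name = resolve_time_column(df, time_column)
    tags = list(tag_column or [])
    missing = [t for t in tags if t not in df.columns]
    if missing:
        # Assumption: unknown TAG column names are rejected.
        raise ValueError(f"tag columns not found: {missing}")
    # Everything that is neither the time column nor a TAG is a FIELD.
    fields = [c for c in df.columns if c != time_name and c not in tags]
    return time_name, tags, fields


df = pd.DataFrame({"Time": [1, 2], "device": ["a", "b"], "value": [1.0, 2.0]})
time_name, tags, fields = split_columns(df, tag_column=["device"])
```

With this frame, `Time` is picked up by the case-insensitive lookup, `device` becomes a TAG column, and `value` is left as a FIELD column.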
@ColinLeeo ColinLeeo force-pushed the support_dataframe_to_tsfile branch from d6aa3e1 to 8c93df1 Compare January 14, 2026 14:37

codecov-commenter commented Jan 14, 2026

Codecov Report

❌ Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.80%. Comparing base (052ff6b) to head (e625b76).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| cpp/src/cwrapper/tsfile_cwrapper.cc | 0.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #706      +/-   ##
===========================================
- Coverage    61.80%   61.80%   -0.01%     
===========================================
  Files          709      709              
  Lines        40375    40378       +3     
  Branches      5686     5687       +1     
===========================================
  Hits         24954    24954              
- Misses       14723    14726       +3     
  Partials       698      698              


@ColinLeeo ColinLeeo force-pushed the support_dataframe_to_tsfile branch from 8c93df1 to 46de126 Compare January 14, 2026 15:09
@ColinLeeo ColinLeeo requested a review from jt2594838 January 14, 2026 15:15
@ColinLeeo ColinLeeo force-pushed the support_dataframe_to_tsfile branch 2 times, most recently from 9d28d1f to 9e1edf5 Compare January 16, 2026 03:41
@ColinLeeo ColinLeeo force-pushed the support_dataframe_to_tsfile branch from 9e1edf5 to 1b3c275 Compare February 8, 2026 20:08
Comment on lines +694 to +696
if (cur_schema->column_category == TIME) {
    continue;
}

If you skip the time column, it will be missing when the file is loaded into IoTDB.

Comment on lines +134 to +141
df_read = to_dataframe(tsfile_path, table_name="test_table")
df_read = df_read.sort_values('time').reset_index(drop=True)
df_sorted = convert_to_nullable_types(df.sort_values('timestamp').reset_index(drop=True))

assert df_read.shape == (30, 3)
assert df_read["time"].equals(df_sorted["timestamp"])
assert df_read["device"].equals(df_sorted["device"])
assert df_read["value"].equals(df_sorted["value"])

The name of the time column should be preserved.
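One way to express that expectation as a check is sketched below. It uses in-memory DataFrames as stand-ins for the written frame and for the result of `to_dataframe`, so it does not touch a TsFile. `check_time_column_preserved` is a hypothetical helper, not part of the tsfile package.

```python
import pandas as pd


def check_time_column_preserved(written: pd.DataFrame,
                                read_back: pd.DataFrame,
                                time_column: str) -> bool:
    """Return True if read_back keeps the original time column name and values."""
    if time_column not in read_back.columns:
        return False
    # Compare the timestamps after sorting, ignoring the index.
    expected = written[time_column].sort_values().to_numpy()
    actual = read_back[time_column].sort_values().to_numpy()
    return len(expected) == len(actual) and bool((expected == actual).all())


written = pd.DataFrame({"timestamp": [3, 1, 2], "value": [0.3, 0.1, 0.2]})

# Stand-in for the behavior the test above asserts: the column comes back as 'time'.
renamed = written.rename(columns={"timestamp": "time"})

# Stand-in for the requested behavior: the column keeps its original name.
preserved = written.copy()

renamed_ok = check_time_column_preserved(written, renamed, "timestamp")
preserved_ok = check_time_column_preserved(written, preserved, "timestamp")
```

Here `renamed_ok` is `False` and `preserved_ok` is `True`. In the real test, the comparison would presumably read `df_read["timestamp"]` rather than `df_read["time"]`.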
